158

13

Useful Algorithms

with r Subscript nrn, and the problem is solved. This example is a well-defined procedure that

leads automatically to the desired result.

An operation frequently required in bioinformatics is sorting a collection of items

(an array of elements), implying arranging them in increasing (or decreasing) order.

The so-called bubble sort (elements “float” to the top of the array) is considered to

be the simplest sorting algorithm. Each element is compared pairwise to each other.

If a pair is found to be in the incorrect order, the two elements are interchanged. The

algorithm is based on two DO-loops (repeat the instructions within the loop for a

preset number of times, or until some condition is fulfilled), one nested inside the

other. The outer loop runs from 1 to1 minus left parenthesis1(the length of the array), and the inner loop

runs from 1 plus left parenthesis1 + (a counter of the outer loop) up to the length of the array.

This sort algorithm is not particularly efficient, in the sense that algorithms requir-

ing fewer instructions to accomplish the same task are available. Often these more

efficient algorithms take more time to program, however. For example, the fast

Fourier transform does indeed require significantly fewer instructions than the ordi-

nary Fourier transform, but nowadays, with the almost universal availability of per-

sonal computers, provided the dataset being transformed is not too large, the extra

work of programming might not be worth the bother. Most personal computers are

switched off at night when they could actually be calculating. The Intel Pentium chip,

introduced around 1994, can carry out more than 100 MIPS (million instructions per

second); this is 5 times faster than the 486 chip, introduced around 1992, and 100

times more than the mainframe DEC VAX 780, introduced in 1977, and for more

than a decade the workhorse of many computing centres; the Cray 1 supercomputer

(1975) could already execute 160 MIPS. The DEC PDP1, introduced around 1960

and also very widely encountered in its day, carries out 0.1 MIPS. IBM’s Deep Blue,

introduced in 1996, can accomplish 10 Superscript 6106 MIPS. Current processors such as the top

of the range Intel Core i7 4770k operate at nearly 130,000 MIPS, while the Blue

Gene Q supercomputer with thousands of processors is capable of 20 petaFLOPS

(i.e.,2 times 10 Superscript 162 × 1016 floating point operations per second which, depending on architecture,

could be many tens of MIPS). The newer Intel chips found in laptops are very pow-

erful: the core i9 (2018) achieves about 410,000 MIPS. When computing jobs were

processed batchwise on a mainframe device there was, of course, strong pressure to

achieve operations such as sorting and matching with as few instructions as possible;

but when the ubiquitous personal computer has 100 times more processing power

than a VAX 780, the effort of achieving it may be considered superfluous by all who

are not professional programmers.

Problem. Write a program to implement the bubble sort algorithm in a high-level

computer language.

Problem. Write an algorithm for searching for all occurrences of a particular word

(a substring) in a string and returning the distance of each occurrence from the start

of the string.